An illustration of the risk of borrowing information via a shared likelihood
A concrete, stylized example illustrates that inferences may be degraded,
rather than improved, by incorporating supplementary data via a joint
likelihood. In the example, the likelihood is assumed to be correctly
specified, as is the prior over the parameter of interest; all that is
necessary for the joint modeling approach to suffer is misspecification of the
prior over a nuisance parameter.
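The phenomenon described above can be sketched numerically. Suppose primary data y_i ~ N(θ, 1) and supplementary data z_j ~ N(θ + b, 1), where b is a nuisance bias. If the analyst's nuisance prior wrongly fixes b = 0 while in truth b ≠ 0, the joint analysis pulls the estimate of θ toward θ + b. All names and numbers below are illustrative assumptions, not taken from the paper:

```python
import random

random.seed(0)
theta, b = 1.0, 2.0          # true parameter and true nuisance bias
n = m = 50

# Primary data informs theta directly; supplementary data carries bias b.
y = [random.gauss(theta, 1.0) for _ in range(n)]
z = [random.gauss(theta + b, 1.0) for _ in range(m)]

ybar = sum(y) / n
zbar = sum(z) / m

# Conditional analysis: use y alone (flat prior on theta).
cond_est = ybar

# Joint analysis under a misspecified nuisance prior fixing b = 0:
# both samples are then treated as draws from N(theta, 1), so the
# estimate is pulled toward theta + b * m / (n + m).
joint_est = (n * ybar + m * zbar) / (n + m)
```

With these settings the conditional estimate lands near θ = 1, while the joint estimate is biased toward 2: borrowing information through the shared likelihood has made inference worse, exactly as the abstract warns.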
A Structural Approach to Coordinate-Free Statistics
We consider the question of learning in general topological vector spaces. By
exploiting known (or parametrized) covariance structures, our Main Theorem
demonstrates that any continuous linear map corresponds to a certain
isomorphism of embedded Hilbert spaces. By inverting this isomorphism and
extending continuously, we construct a version of the Ordinary Least Squares
estimator in absolute generality. Our Gauss-Markov theorem demonstrates that
OLS is a "best linear unbiased estimator", extending the classical result. We
construct a stochastic version of the OLS estimator, which is a continuous
disintegration exactly for the class of "uncorrelated implies independent"
(UII) measures. As a consequence, Gaussian measures always exhibit continuous
disintegrations through continuous linear maps, extending a theorem of the
first author. Applying this framework to some problems in machine learning, we
prove a useful representation theorem for covariance tensors, and show that OLS
defines a good kriging predictor for vector-valued arrays on general index
spaces. We also construct a support-vector machine classifier in this setting.
We hope that our article shines light on some deeper connections between
probability theory, statistics and machine learning, and may serve as a point
of intersection for these three communities.
Comment: 31 pages
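The paper works in general topological vector spaces, but in the familiar special case of R^n the construction reduces to ordinary least squares via the normal equations. As a minimal finite-dimensional sketch (not the paper's coordinate-free machinery), the one-predictor case collapses to the textbook slope/intercept formulas:

```python
def ols_simple(x, y):
    """Closed-form OLS for simple linear regression y = a + b*x.

    In the one-predictor special case, the normal equations collapse
    to the textbook slope and intercept formulas.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope

a, b = ols_simple([0, 1, 2, 3], [1, 3, 5, 7])  # exact line y = 1 + 2x
```

The Gauss-Markov theorem in the abstract generalizes exactly this estimator's "best linear unbiased" property from R^n to embedded Hilbert spaces.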
Shrinkage priors for linear instrumental variable models with many instruments
This paper addresses the weak instruments problem in linear instrumental
variable models from a Bayesian perspective. The new approach has two
components. First, a novel predictor-dependent shrinkage prior is developed for
the many instruments setting. The prior is constructed based on a factor model
decomposition of the matrix of observed instruments, allowing many instruments
to be incorporated into the analysis in a robust way.
Second, the new prior is implemented via an importance sampling scheme, which
utilizes posterior Monte Carlo samples from a first-stage Bayesian regression
analysis. This modular computation makes sensitivity analyses straightforward.
Two simulation studies are provided to demonstrate the advantages of the new
method. As an empirical illustration, the new method is used to estimate a key
parameter in macroeconomic models: the elasticity of intertemporal
substitution. The empirical analysis produces substantive conclusions in line
with previous studies, but certain inconsistencies of earlier analyses are
resolved.
Comment: 27 pages, 6 figures, 3 tables
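The modular computation described above rests on a standard identity: posterior draws obtained under one prior can be reweighted to the posterior under another prior, because the likelihood is shared and cancels, leaving importance weights proportional to the ratio of the two priors. A toy sketch under purely illustrative assumptions (a Gaussian cloud standing in for first-stage posterior draws, a Laplace target prior):

```python
import math
import random

random.seed(1)

# Stand-in first-stage posterior draws for a coefficient beta,
# obtained under a vague prior p0 = N(0, 10^2).
draws = [random.gauss(1.0, 0.5) for _ in range(5000)]

def log_p0(b):
    # Vague Gaussian prior, up to an additive constant.
    return -0.5 * (b / 10.0) ** 2

def log_p1(b):
    # Target shrinkage prior Laplace(0, 0.5), up to a constant.
    return -abs(b) / 0.5

# Shared likelihood cancels: weights are proportional to p1 / p0.
logw = [log_p1(b) - log_p0(b) for b in draws]
mx = max(logw)                                # stabilize before exponentiating
w = [math.exp(lw - mx) for lw in logw]
wsum = sum(w)

plain_mean = sum(draws) / len(draws)
shrunk_mean = sum(wi * bi for wi, bi in zip(w, draws)) / wsum
```

Reweighting toward the heavier-shrinkage prior pulls the posterior mean toward zero without rerunning the first-stage sampler, which is what makes sensitivity analysis across priors cheap.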
Decoupling shrinkage and selection in Bayesian linear models: a posterior summary perspective
Selecting a subset of variables for linear models remains an active area of
research. This paper reviews many of the recent contributions to the Bayesian
model selection and shrinkage prior literature. A posterior variable selection
summary is proposed, which distills a full posterior distribution over
regression coefficients into a sequence of sparse linear predictors.
Comment: 30 pages, 6 figures, 2 tables
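The paper's summary is obtained by solving a penalized optimization against posterior predictions; as a much cruder stand-in that conveys the shape of the output, one can order posterior-mean coefficients by magnitude and emit a nested sequence of increasingly dense sparse predictors. This sketch is illustrative only, not the paper's procedure:

```python
def sparse_sequence(beta_bar):
    """Distill a posterior-mean coefficient vector into a nested sequence
    of sparse linear predictors by retaining the k largest-magnitude
    coefficients for k = 1, ..., p (a crude stand-in for the paper's
    penalized posterior summary).
    """
    order = sorted(range(len(beta_bar)), key=lambda j: -abs(beta_bar[j]))
    seq = []
    for k in range(1, len(beta_bar) + 1):
        keep = set(order[:k])
        seq.append([b if j in keep else 0.0 for j, b in enumerate(beta_bar)])
    return seq

seq = sparse_sequence([0.1, -2.0, 0.0, 0.7])
```

The point of the decoupling is that shrinkage happens once, in the posterior, and the sparsity level is chosen afterwards as a reporting decision rather than a modeling one.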
Predictor-dependent shrinkage for linear regression via partial factor modeling
In prediction problems with more predictors than observations, it can
sometimes be helpful to use a joint probability model, p(y, x), rather than
a purely conditional model, p(y | x), where y is a scalar response
variable and x is a vector of predictors. This approach is motivated by the
fact that in many situations the marginal predictor distribution p(x) can
provide useful information about the parameter values governing the conditional
regression. However, under very mild misspecification, this marginal
distribution can also lead conditional inferences astray. Here, we explore
these ideas in the context of linear factor models, to understand how they play
out in a familiar setting. The resulting Bayesian model performs well across a
wide range of covariance structures, on real and simulated data.
Comment: 16 pages, 1 figure, 2 tables
Regret-based Selection for Sparse Dynamic Portfolios
This paper considers portfolio construction in a dynamic setting. We specify
a loss function comprised of utility and complexity components with an unknown
tradeoff parameter. We develop a novel regret-based criterion for selecting the
tradeoff parameter to construct optimal sparse portfolios over time.
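One stylized reading of a regret-based criterion: among candidate portfolios of varying sparsity, select the sparsest one whose expected utility falls within a regret tolerance of the best available candidate. The sketch below is a hypothetical illustration of that selection rule, not the paper's dynamic procedure, and all numbers are invented:

```python
def select_by_regret(candidates, tol):
    """candidates: list of (num_assets, expected_utility) pairs.

    Return the sparsest portfolio whose utility regret relative to the
    best candidate is at most tol -- a stylized way of resolving the
    utility/complexity tradeoff by bounding regret instead of fixing
    the tradeoff parameter directly.
    """
    best_u = max(u for _, u in candidates)
    feasible = [(k, u) for k, u in candidates if best_u - u <= tol]
    return min(feasible)  # fewest assets; ties broken by lower utility

chosen = select_by_regret([(30, 1.00), (10, 0.97), (3, 0.80)], tol=0.05)
```

Here a 10-asset portfolio sacrifices only 0.03 utility relative to the dense optimum and is therefore preferred, while the 3-asset portfolio's regret of 0.20 exceeds the tolerance.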
Efficient sampling for Gaussian linear regression with arbitrary priors
This paper develops a slice sampler for Bayesian linear regression models
with arbitrary priors. The new sampler has two advantages over current
approaches. One, it is faster than many custom implementations that rely on
auxiliary latent variables, if the number of regressors is large. Two, it can
be used with any prior with a density function that can be evaluated up to a
normalizing constant, making it ideal for investigating the properties of new
shrinkage priors without having to develop custom sampling algorithms. The new
sampler takes advantage of the special structure of the linear regression
likelihood, allowing it to produce better effective sample size per second than
common alternative approaches.
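The paper's sampler exploits the special structure of the regression likelihood, but its key selling point, needing only a density evaluable up to a normalizing constant, is shared with generic slice sampling. As a self-contained illustration of that requirement, here is a standard univariate stepping-out slice sampler applied to a toy posterior (normal likelihood with a Laplace prior); the target and all settings are assumptions for the demo, not the paper's algorithm:

```python
import random

def slice_sample(logpost, x0, n_draws, w=1.0, seed=0):
    """Univariate stepping-out slice sampler.

    Requires only that logpost can be evaluated up to an additive
    constant -- the same minimal requirement the abstract highlights.
    """
    rng = random.Random(seed)
    x, draws = x0, []
    for _ in range(n_draws):
        logy = logpost(x) - rng.expovariate(1.0)  # slice height under the curve
        left = x - rng.uniform(0.0, w)            # step out an interval
        right = left + w
        while logpost(left) > logy:
            left -= w
        while logpost(right) > logy:
            right += w
        while True:                               # shrink until a point lands on the slice
            x_new = rng.uniform(left, right)
            if logpost(x_new) > logy:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        draws.append(x)
    return draws

# Toy target: N(ybar, 1/n) likelihood for theta times a Laplace(0, 1)
# prior, known only up to a normalizing constant.
n, ybar = 20, 2.0
logpost = lambda t: -0.5 * n * (t - ybar) ** 2 - abs(t)
draws = slice_sample(logpost, x0=0.0, n_draws=2000)
post_mean = sum(draws) / len(draws)
```

Swapping in a different shrinkage prior only changes the `logpost` function, which is precisely why this style of sampler is convenient for prior exploration.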
XBART: Accelerated Bayesian Additive Regression Trees
Bayesian additive regression trees (BART) (Chipman et al., 2010) is a
powerful predictive model that often outperforms alternative models at
out-of-sample prediction. BART is especially well-suited to settings with
unstructured predictor variables and substantial sources of unmeasured
variation as is typical in the social, behavioral and health sciences. This
paper develops a modified version of BART that is amenable to fast posterior
estimation. We present a stochastic hill climbing algorithm that matches the
remarkable predictive accuracy of previous BART implementations, but is many
times faster and less memory intensive. Simulation studies show that the new
method is comparable in computation time and more accurate at function
estimation than both random forests and gradient boosting.
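The elementary operation that tree methods like (X)BART repeat recursively is a split search. As a much-simplified illustration of that single ingredient, not of BART's Bayesian criterion or XBART's stochastic hill climbing, here is an exhaustive search for the one split of a predictor that minimizes within-leaf squared error:

```python
def best_split(x, y):
    """Find the single cut point on predictor x minimizing the total
    within-leaf sum of squared errors of y -- the basic split-search
    step that tree ensembles apply recursively."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    xs = sorted(set(x))
    best_cut, best_loss = None, float("inf")
    for a, b in zip(xs, xs[1:]):
        cut = (a + b) / 2.0                       # midpoint between adjacent values
        left = [yi for xi, yi in zip(x, y) if xi <= cut]
        right = [yi for xi, yi in zip(x, y) if xi > cut]
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_cut, best_loss = cut, loss
    return best_cut

cut = best_split([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1])
```

On this step-function example the search recovers the true breakpoint at 4.5; XBART's speedups come from organizing exactly this kind of computation around pre-sorted predictors and approximate stochastic search.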
Variable Selection in Seemingly Unrelated Regressions with Random Predictors
This paper considers linear model selection when the response is
vector-valued and the predictors are randomly observed. We propose a new
approach that decouples statistical inference from the selection step in a
"post-inference model summarization" strategy. We study the impact of predictor
uncertainty on the model selection procedure. The method is demonstrated
through an application to asset pricing.
A Bayesian hierarchical model for inferring player strategy types in a number guessing game
This paper presents an in-depth statistical analysis of an experiment
designed to measure the extent to which players in a simple game behave
according to a popular behavioral economic model. The p-beauty contest is a
multi-player number guessing game that has been widely used to study strategic
behavior. This paper describes beauty contest experiments for an audience of
data analysts, with a special focus on a class of models for game play called
k-step thinking models, which allow each player in the game to employ an
idiosyncratic strategy. We fit a Bayesian statistical model to estimate the
proportion of our player population whose game play is compatible with a k-step
thinking model. Our findings put this number at approximately 25%.
Comment: 46 pages, 14 figures, 2 tables
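The k-step structure is easy to compute directly. In a p-beauty contest on [0, 100] with target fraction p (commonly 2/3), a level-0 player guesses the anchor (e.g. the midpoint, 50), and a level-k player best-responds to a population of level-(k-1) players, giving guesses of p^k times the anchor. A minimal sketch, with the anchor value an illustrative convention rather than a claim about this paper's experiment:

```python
def k_step_guess(k, p=2/3, anchor=50.0):
    """Guess of a level-k thinker in a p-beauty contest on [0, 100]:
    level 0 guesses the anchor, and level k best-responds to a
    population of level-(k-1) players by guessing p times their guess,
    i.e. p**k * anchor."""
    guess = anchor
    for _ in range(k):
        guess *= p
    return guess

ladder = [round(k_step_guess(k), 2) for k in range(4)]
# -> [50.0, 33.33, 22.22, 14.81]
```

Each player's observed guess can then be matched against this ladder, which is the sense in which game play is "compatible with" a k-step thinking model.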